This is an R Markdown Notebook. When you execute code within the notebook, the results appear beneath the code. This notebook replicates the results in \(\textbf{New tools for network time series with an application to COVID-19 hospitalisations}\), and produces further Corbit and R-Corbit plot examples; see https://arxiv.org/abs/2312.00530.
We simulate realisations of differing lengths from the stationary global-\(\alpha\) GNAR model below:
\[ \begin{equation} \label{eq:vector wise representation} \boldsymbol{X}_{t} = \sum_{k = 1}^{p} \left (\alpha_{k} \boldsymbol{X}_{t - k} + \sum_{r = 1}^{s_k} \beta_{kr} \boldsymbol{Z}_{r, t - k} \right ) + \boldsymbol{u}_{t} . \end{equation} \] We use the \(\mathbf{fiveNet}\) network, included in the CRAN \(\texttt{GNAR}\) package, throughout this notebook for producing all relevant figures.
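For intuition, the recursion above can be written out directly in base \(\texttt{R}\) for a toy three-node network with one lag and one neighbourhood stage. The weight matrix, coefficients and seed below are illustrative stand-ins only; the \(\texttt{GNARsim}\) function used later performs this simulation properly.

```r
# Toy GNAR(1, [1]) recursion: X_t = alpha * X_{t-1} + beta * Z_{1, t-1} + u_t,
# where Z_{1, t-1} = W %*% X_{t-1} averages over one-stage neighbours.
# W, alpha and beta are hypothetical, chosen so that |alpha| + |beta| < 1.
set.seed(1)
W <- matrix(c(0.0, 0.5, 0.5,
              0.5, 0.0, 0.5,
              0.5, 0.5, 0.0), nrow = 3, byrow = TRUE)  # row-normalised weights
alpha <- 0.4
beta <- 0.3
n <- 200
X <- matrix(0, nrow = n, ncol = 3)
for (t in 2:n) {
  Z <- as.vector(W %*% X[t - 1, ])   # one-stage neighbourhood regression
  X[t, ] <- alpha * X[t - 1, ] + beta * Z + rnorm(3)
}
```

Since \(|\alpha_{1}| + |\beta_{11}| < 1\), the simulated process is stationary, which is the setting assumed by the Corbit plots below.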
We begin by loading the GNAR package into our session in \(\texttt{R}\).
We simulate a realisation of length one-hundred using the \(\texttt{GNARsim}\) function. The code for producing the first simulation also shows how to compute the \(r\)-stage adjacency matrices and the weights matrix for the \(\mathbf{fiveNet}\) network.
n_sims = 100
# Compute the r-stage adjacency matrices
stages_tensor = get_k_stages_adjacency_tensor(as.matrix(GNARtoigraph(fiveNet)), 3)
# Compute the weights matrix for the fiveNet network
W_normalised = weights_matrix(fiveNet, 3)
# First simulation: GNAR(1, [1])
sim1 <- GNARsim(n = n_sims, net = fiveNet,
                alphaParams = list(rep(0.47, 5)),
                betaParams = list(c(0.46)))
We proceed to compute a Corbit plot for the simulation produced above.
m_lag = 20
m_stage = 5
corbit_plot(vts = sim1, net = fiveNet, max_lag = m_lag, max_stage = 3, viridis_color_option = 'cividis')
Figure 1: Corbit plot for a realisation of length 100 coming from a stationary global-alpha GNAR(1, [1]) process.
The Corbit plot above uses the \(\texttt{cividis}\) colour scale; however, users can specify any colour scale supported by the \(\texttt{viridis}\) package (all of which are colour-blind friendly). Note that the Corbit plot can quickly aid analysts in detecting seasonality and trend in an observed realisation of a (network) time series. We incorporate a seasonal term, a cosine wave of period four, into Simulation 1 and produce the new Corbit plot.
m_lag = 20
m_stage = 5
sim1_seasonal = vapply(c(1:5), function(x) {sim1[, x] + 2 * (1 + cos(c(1:n_sims) * pi / 2)) }, rep(0, n_sims))
corbit_plot(vts = sim1_seasonal, net = fiveNet, max_lag = m_lag, max_stage = 3, viridis_color_option = 'cividis')
Figure 2: Corbit plot for a realisation of length 100 coming from a seasonal global-alpha GNAR(1, [1]) process.
The Corbit plot in Figure 2 successfully reflects the seasonal period every four lags. We now incorporate a linear trend into Simulation 1 and analyse the resulting Corbit plot.
m_lag = 20
m_stage = 5
sim1_trend = vapply(c(1:5), function(x) {sim1[, x] + c(1:n_sims) }, rep(0, n_sims))
corbit_plot(vts = sim1_trend, net = fiveNet, max_lag = m_lag, max_stage = 3, viridis_color_option = 'cividis')
Figure 3: Corbit plot for a realisation of length 100 coming from a global-alpha GNAR(1, [1]) process plus a linear trend.
Finally, we incorporate both a trend and seasonal component into Simulation 1, and analyse the resulting Corbit plot.
m_lag = 20
m_stage = 5
sim1_trend_seasonal = vapply(c(1:5), function(x) {sim1[, x] + 2 * (1 + cos(c(1:n_sims) * pi / 2)) + c(1:n_sims) / 10 }, rep(0, n_sims))
corbit_plot(vts = sim1_trend_seasonal, net = fiveNet, max_lag = m_lag, max_stage = 3, viridis_color_option = 'cividis')
Figure 4: Corbit plot for a realisation of length 100 coming from a global-alpha GNAR(1, [1]) process plus a linear trend and a seasonal component.
We simulate a realisation of length one-thousand coming from a stationary global-\(\alpha\) GNAR\((2, [1, 2])\) process. Subsequently, we compute the corresponding PNACF Corbit plot.
# Simulate a realisation of length 1000 coming from a stationary GNAR(2, [1, 2])
nsims_2 = 1000
sim2 <- GNARsim(n = nsims_2, net = fiveNet,
                alphaParams = list(rep(0.12, 5), rep(0.18, 5)),
                betaParams = list(c(0.21), c(-0.32, 0.12)))
# Compute the corresponding Corbit plot for the PNACF
corbit_plot(vts = sim2, net = fiveNet, max_lag = m_lag, max_stage = 3, viridis_color_option = 'turbo', partial = "yes")
We examine another model.
# Simulate a realisation of length 100 coming from a stationary GNAR(2, [1, 2])
nsims = 100
sim2 <- GNARsim(n = nsims, net = fiveNet,
                alphaParams = list(rep(0.21, 5), rep(0.18, 5)),
                betaParams = list(c(0.12), c(0.32, 0.12)))
# Compute the corresponding Corbit plot for the PNACF
corbit_plot(vts = sim2, net = fiveNet, max_lag = m_lag, max_stage = 3, viridis_color_option = 'viridis', partial = "yes")
We examine a basic GNAR model.
# Simulate a realisation of length 100 coming from a stationary GNAR(2, [1, 1])
nsims = 100
sim2 <- GNARsim(n = nsims, net = fiveNet,
                alphaParams = list(rep(0.35, 5), rep(0.12, 5)),
                betaParams = list(c(0.24), c(0.17)))
# Compute the corresponding Corbit plot for the PNACF
corbit_plot(vts = sim2, net = fiveNet, max_lag = m_lag, max_stage = 3, viridis_color_option = 'viridis', partial = "yes")
Corbit plots can be produced in alternative ways that naturally handle missing values. We present these alternatives in this subsection.
We start with a Corbit plot that includes a line segment between the first and last lags, which is done by setting the option \(\texttt{line_segment = "yes"}\) in a \(\texttt{corbit_plot_edt}\) call. It is also possible to print the \(\mathrm{(p)nacf}\) values, either by printing the call or by assigning the result to an object (\(\texttt{corbit_plot_edt}\) invisibly returns a matrix containing these values). The data used for producing the Corbit plots below is \(\mathbf{gdpVTS}\) (i.e., a GDP vector time series) included in the \(\texttt{GNAR}\) package.
print(corbit_plot_edt(vts = gdpVTS, net = igraphtoGNAR(net_result), max_lag = 10, max_stage = max_stage_nacf, weight_matrix = W0, line_segment = "yes"))
## Warning in missing_data_warning(missing_data = (sum(is.na(vts)) != 0)): Missing
## values detected, continuing by adjusting the weights accordingly.
## r-stage: 1 r-stage: 2 r-stage: 3
## [1,] -0.225832886 -0.178778077 -1.611198e-01
## [2,] -0.225075748 -0.218256718 -1.390373e-01
## [3,] 0.045703808 0.043935240 2.404970e-02
## [4,] -0.018711111 -0.014943649 -1.955781e-02
## [5,] -0.043011145 -0.040858181 -1.482430e-02
## [6,] 0.040674032 0.034314219 1.909578e-02
## [7,] 0.007962979 0.008580029 1.848319e-05
## [8,] 0.011112296 0.018858966 4.138276e-03
## [9,] -0.040526609 -0.048194040 -1.514524e-02
## [10,] 0.084719447 0.071861991 5.012933e-02
We can produce a rectangular Corbit plot by setting \(\texttt{rectangular_plot = "yes"}\).
print(corbit_plot_edt(vts = gdpVTS, net = igraphtoGNAR(net_result), max_lag = 10, max_stage = max_stage_nacf, weight_matrix = W0, rectangular_plot="yes"))
## Warning in missing_data_warning(missing_data = (sum(is.na(vts)) != 0)): Missing
## values detected, continuing by adjusting the weights accordingly.
## r-stage: 1 r-stage: 2 r-stage: 3
## [1,] -0.225832886 -0.178778077 -1.611198e-01
## [2,] -0.225075748 -0.218256718 -1.390373e-01
## [3,] 0.045703808 0.043935240 2.404970e-02
## [4,] -0.018711111 -0.014943649 -1.955781e-02
## [5,] -0.043011145 -0.040858181 -1.482430e-02
## [6,] 0.040674032 0.034314219 1.909578e-02
## [7,] 0.007962979 0.008580029 1.848319e-05
## [8,] 0.011112296 0.018858966 4.138276e-03
## [9,] -0.040526609 -0.048194040 -1.514524e-02
## [10,] 0.084719447 0.071861991 5.012933e-02
Finally, setting \(\texttt{rectangular_plot = "square"}\) produces a square Corbit plot.
print(corbit_plot_edt(vts = gdpVTS, net = igraphtoGNAR(net_result), max_lag = 10, max_stage = max_stage_nacf, weight_matrix = W0, rectangular_plot="square"))
## Warning in missing_data_warning(missing_data = (sum(is.na(vts)) != 0)): Missing
## values detected, continuing by adjusting the weights accordingly.
## r-stage: 1 r-stage: 2 r-stage: 3
## [1,] -0.225832886 -0.178778077 -1.611198e-01
## [2,] -0.225075748 -0.218256718 -1.390373e-01
## [3,] 0.045703808 0.043935240 2.404970e-02
## [4,] -0.018711111 -0.014943649 -1.955781e-02
## [5,] -0.043011145 -0.040858181 -1.482430e-02
## [6,] 0.040674032 0.034314219 1.909578e-02
## [7,] 0.007962979 0.008580029 1.848319e-05
## [8,] 0.011112296 0.018858966 4.138276e-03
## [9,] -0.040526609 -0.048194040 -1.514524e-02
## [10,] 0.084719447 0.071861991 5.012933e-02
R-Corbit plots allow quick comparison of network correlation structures for different time-slices and/or covariates (e.g., labels). We produce an example comparing three distinct GNAR simulations below.
# Initialise values
m_lag = 20
m_stage = 3
n_sims = 200
set.seed(2023)
# Produce GNAR simulations
sim1 <- GNARsim(n = n_sims, net = fiveNet,
                alphaParams = list(c(0.1, 0.12, 0.16, 0.075, 0.21),
                                   c(0.12, 0.14, 0.15, 0.3, 0.22)),
                betaParams = list(c(0.16, 0.10), c(0.14, 0.11)))
sim2 <- GNARsim(n = n_sims, net = fiveNet,
                alphaParams = list(rep(0.23, 5), rep(0.11, 5)),
                betaParams = list(c(0.21), c(0.12)))
sim3 <- GNARsim(n = n_sims, net = fiveNet,
                alphaParams = list(rep(0.12, 5), rep(0.10, 5)),
                betaParams = list(c(0.25, 0.27), c(0.18)))
# Produce R-Corbit plots
R_corbit_plot(list(sim1, sim2, sim3), list(fiveNet), m_lag, m_stage,
              frame_names = c("GNAR(2, [2, 2]) simulation.", "GNAR(2, [1, 1]) simulation.",
                              "GNAR(2, [2, 1]) simulation."),
              same_net = "yes")
R_corbit_plot(list(sim1, sim2, sim3), list(fiveNet), m_lag, m_stage,
              frame_names = c("GNAR(2, [2, 2]) simulation.", "GNAR(2, [1, 1]) simulation.",
                              "GNAR(2, [2, 1]) simulation."),
              same_net = "yes", partial = "yes")
We start by loading the NHS trust network time series (included in the \(\texttt{GNAR}\) package). The network time series is a realisation of length four-hundred-and-fifty-two (452) that corresponds to the logarithm of the daily number of patients occupying mechanical ventilation beds, i.e., \(\log (1 + Y_{i, t})\), where \(Y_{i, t}\) is the number of patients occupying mechanical ventilation beds in NHS trust \(i = 1, \dots, 140\) on day \(t = 1, \dots, 452\).
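As a quick aside, the transform \(\log (1 + Y_{i, t})\) and its inverse are available in base \(\texttt{R}\) as \(\texttt{log1p}\) and \(\texttt{expm1}\). The counts below are hypothetical and serve only to illustrate the round trip one would use to map forecasts on the log scale back to bed occupancies.

```r
# Hypothetical daily mechanical-ventilation bed counts for one trust
y <- c(0, 3, 17, 120)
# The transformed series X_{i, t} = log(1 + Y_{i, t})
x <- log1p(y)
# expm1 inverts the transform, recovering the original counts
stopifnot(isTRUE(all.equal(expm1(x), y)))
```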
# Preliminary objects
# load the NHS trust network time series
covid_data = logMVbedMVC.vts
# Compute the weight matrix for the NHS trusts network
W_nhs_trust = weights_matrix(NHSTrustMVCAug120.net, 1)
We visualise the evolution of the series by plotting the sum of the total number of patients occupying mechanical ventilation beds on each day. Further, we compute the mean and standard deviation up to day \(t\) for exploratory data analysis.
sum_series = as.ts(rowSums(logMVbedMVC.vts))
plot(c(1:452), sum_series, 'l', xlab = 't', main = expression(paste('Plot of rolling sum of all NHS time-series: ', Y[t] == sum(X["i, t"], i == 1, 140))),
ylab = expression(Y[t]), col = 'blue')
plot(c(1:452), sum_series / 140, 'l', xlab = 't', main = expression(paste('Plot of rolling incremental mean occupancy of all NHS time-series: ', Y[t] == sum(X["i, t"], i == 1, 140) / 140)),
ylab = expression(Y[t]), col = 'blue')
# compute mean and standard deviation up to day t
sum_series_mean = vapply(c(1:length(sum_series)), function(x) {mean(sum_series[1:x])}, 0)
sum_series_sd = vapply(c(1:length(sum_series)), function(x) {sd(sum_series[1:x])}, 0)
sd_time_series_values = vapply(c(1:452), function(x) {sd(logMVbedMVC.vts[x, ])}, 0.0)
# plot of rolling standard deviation of all NHS trusts at each time point, i.e., sd(X_{i, t})
plot(c(1:452), sum_series_mean, 'l', xlab = 't', main = expression(paste('Plot of rolling incremental mean sum of all NHS time-series: ', Y[t] == sum(X["i, t"], i == 1, 140))),
ylab = expression(mean(Y[t])), col = 'blue')
plot(c(1:452), sd_time_series_values, 'l', xlab = 't', main = expression(paste('Plot of rolling standard deviation of joint time-series: ', Y[t] == sd(X["i, t"]), ", where i = 1 to 140")),
ylab = expression(Y[t]), col = 'blue')
plot(c(1:452), sum_series_sd, 'l', xlab = 't', main = expression(paste('Plot of rolling incremental standard deviation of the sum consisting of all NHS time-series: ', Y[t] == sum(X["i, t"], i == 1, 140))),
ylab = expression(sd(Y[t])), col = 'blue')
We analyse the network autocorrelation structure by examining the Corbit plots below. First, we produce the NACF Corbit plot.
corbit_plot(covid_data, NHSTrustMVCAug120.net, 20, 6)
Subsequently, we analyse the PNACF Corbit plot, which aids model selection and visualising dependence in the data.
corbit_plot(covid_data, NHSTrustMVCAug120.net, 10, 6, partial="yes")
The Corbit plots above suggest that network autocorrelation cuts off after the first lag and, further, that one-stage neighbours appear to have a stronger correlation than higher-order \(r\)-stage neighbours. These Corbit plots suggest comparing the following models.
# Create a table for storing model diagnostic statistics
GNAR_model_comparison_table <- array(dim = c(6, 5))
colnames(GNAR_model_comparison_table) <- c("#(parameters)", " One-Step SE ", " Two-Step SE ", "AIC", "BIC")
We compare GNAR fits for the time-steps selected below.
covid_data_set = covid_data[1:447, ]
test_set = covid_data[448:452, ]
The code for fitting a GNAR\((1, [1])\) is
# GNAR(1, [1])
# model order specification
plag = 1
stages = c(1)
# fitting the model
covid_fit <- GNARfit(covid_data_set, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
# Number of parameters
GNAR_model_comparison_table[1, 1] = length(covid_fit$mod$coefficients)
GNAR_model_comparison_table[1, 4] = AIC(covid_fit)
GNAR_model_comparison_table[1, 5] = BIC(covid_fit)
# one-step ahead forecast
one_step = predict(covid_fit)
GNAR_model_comparison_table[1, 2] = sum((one_step - test_set[1, ])^2)
# two-step ahead forecast
new_data = rbind(covid_data_set, one_step)
new_fit = GNARfit(new_data, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
GNAR_model_comparison_table[1, 3] = sum((predict(new_fit)- test_set[2, ])^2)
The code for fitting a GNAR\((1, [6])\) is
# GNAR(1, [6])
# model order specification
plag = 1
stages = c(6)
# fitting the model
covid_fit <- GNARfit(covid_data_set, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
# Number of parameters
GNAR_model_comparison_table[2, 1] = length(covid_fit$mod$coefficients)
GNAR_model_comparison_table[2, 4] = AIC(covid_fit)
GNAR_model_comparison_table[2, 5] = BIC(covid_fit)
# one-step ahead forecast
one_step = predict(covid_fit)
GNAR_model_comparison_table[2, 2] = sum((one_step - test_set[1, ])^2)
# two-step ahead forecast
new_data = rbind(covid_data_set, one_step)
new_fit = GNARfit(new_data, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
GNAR_model_comparison_table[2, 3] = sum((predict(new_fit)- test_set[2, ])^2)
The code for fitting a GNAR\((2, [6, 6])\) is
# GNAR(2, [6, 6])
# model order specification
plag = 2
stages = c(6, 6)
# fitting the model
covid_fit <- GNARfit(covid_data_set, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
# Number of parameters
GNAR_model_comparison_table[3, 1] = length(covid_fit$mod$coefficients)
GNAR_model_comparison_table[3, 4] = AIC(covid_fit)
GNAR_model_comparison_table[3, 5] = BIC(covid_fit)
# one-step ahead forecast
one_step = predict(covid_fit)
GNAR_model_comparison_table[3, 2] = sum((one_step - test_set[1, ])^2)
# two-step ahead forecast
new_data = rbind(covid_data_set, one_step)
new_fit = GNARfit(new_data, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
GNAR_model_comparison_table[3, 3] = sum((predict(new_fit)- test_set[2, ])^2)
The code for fitting a GNAR\((6, [6, \dots, 6])\) is
# GNAR(6, [6, ..., 6])
# model order specification
plag = 6
stages = rep(6, 6)
# fitting the model
covid_fit <- GNARfit(covid_data_set, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
# Number of parameters
GNAR_model_comparison_table[4, 1] = length(covid_fit$mod$coefficients)
GNAR_model_comparison_table[4, 4] = AIC(covid_fit)
GNAR_model_comparison_table[4, 5] = BIC(covid_fit)
# one-step ahead forecast
one_step = predict(covid_fit)
GNAR_model_comparison_table[4, 2] = sum((one_step - test_set[1, ])^2)
# two-step ahead forecast
new_data = rbind(covid_data_set, one_step)
new_fit = GNARfit(new_data, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
GNAR_model_comparison_table[4, 3] = sum((predict(new_fit)- test_set[2, ])^2)
The code for fitting a GNAR\((1, [0])\) is
# GNAR(1, [0])
# model order specification
plag = 1
stages = c(0)
# fitting the model
covid_fit <- GNARfit(covid_data_set, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
# Number of parameters
GNAR_model_comparison_table[5, 1] = length(covid_fit$mod$coefficients)
GNAR_model_comparison_table[5, 4] = AIC(covid_fit)
GNAR_model_comparison_table[5, 5] = BIC(covid_fit)
# one-step ahead forecast
one_step = predict(covid_fit)
GNAR_model_comparison_table[5, 2] = sum((one_step - test_set[1, ])^2)
# two-step ahead forecast
new_data = rbind(covid_data_set, one_step)
new_fit = GNARfit(new_data, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
GNAR_model_comparison_table[5, 3] = sum((predict(new_fit)- test_set[2, ])^2)
The code for fitting a GNAR\((10, [0])\) is
# GNAR(10, [0])
# model order specification
plag = 10
stages = rep(0, 10)
# fitting the model
covid_fit <- GNARfit(covid_data_set, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
# Number of parameters
GNAR_model_comparison_table[6, 1] = length(covid_fit$mod$coefficients)
GNAR_model_comparison_table[6, 4] = AIC(covid_fit)
GNAR_model_comparison_table[6, 5] = BIC(covid_fit)
# one-step ahead forecast
one_step = predict(covid_fit)
GNAR_model_comparison_table[6, 2] = sum((one_step - test_set[1, ])^2)
# two-step ahead forecast
new_data = rbind(covid_data_set, one_step)
new_fit = GNARfit(new_data, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
GNAR_model_comparison_table[6, 3] = sum((predict(new_fit)- test_set[2, ])^2)
We compare the models by printing the following table.
|                      | #(parameters) | One-Step SE | Two-Step SE | AIC      | BIC      |
|----------------------|---------------|-------------|-------------|----------|----------|
| GNAR(1, [1])         | 2             | 5.088       | 8.808       | -452.393 | -452.375 |
| GNAR(1, [6])         | 7             | 5.064       | 8.799       | -452.532 | -452.468 |
| GNAR(2, [6, 6])      | 14            | 5.166       | 8.218       | -458.883 | -458.755 |
| GNAR(6, [6, ..., 6]) | 42            | 5.238       | 9.882       | -467.350 | -466.964 |
| GNAR(1, [0])         | 1             | 5.030       | 8.845       | -451.739 | -451.730 |
| GNAR(10, [0])        | 10            | 5.027       | 8.654       | -472.229 | -472.137 |
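The table above can be produced with \(\texttt{knitr::kable}\) once row labels are attached. The snippet below is a self-contained sketch using a placeholder zero matrix in place of the filled \(\texttt{GNAR_model_comparison_table}\); \(\texttt{knitr}\) is assumed to be installed.

```r
# Placeholder standing in for the filled GNAR_model_comparison_table
tab <- matrix(0, nrow = 6, ncol = 5,
              dimnames = list(c("GNAR(1, [1])", "GNAR(1, [6])", "GNAR(2, [6, 6])",
                                "GNAR(6, [6, ..., 6])", "GNAR(1, [0])", "GNAR(10, [0])"),
                              c("#(parameters)", "One-Step SE", "Two-Step SE", "AIC", "BIC")))
# Render as a simple markdown table (guarded in case knitr is unavailable)
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::kable(tab, "simple")
}
```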
We compare a GNAR\((1, [1])\) with VAR and sparse VAR models. We begin by loading the required libraries.
library(forecast)
library(vars)
library(sparsevar)
We perform one-step ahead prediction ten times. The starting time-step and error vectors are initialised below.
# number of one-step ahead predictions
# dummy counter variable
k = 0
t_steps = 10
# starting point
covid_data = logMVbedMVC.vts
initial_start = nrow(covid_data) - t_steps
decimal_digits = 3
# AR model prediction error
ar_pe = rep(0, t_steps)
# global-alpha GNAR model prediction error
gnar_pe = rep(0, t_steps)
# local-alpha GNAR model prediction error
gnar_no_global_pe = rep(0, t_steps)
# VAR prediction error
var_pe = rep(0, t_steps)
# restricted VAR prediction error
res_var_pe = rep(0, t_steps)
# sparse VAR prediction error
sparse_var_pe = rep(0, t_steps)
We perform one-ahead forecasting ten times and save the squared error.
for (i in 1:t_steps) {
# initialise the comparison table
output <- matrix(rep(0, 18), nrow = 6, ncol = 3)
train_end = initial_start + k
test_point = train_end + 1
# We compare the GNAR models in this script
# 140 AR(1) models
arforecast <- apply(covid_data[1:train_end, ], 2, function(x){
forecast(auto.arima(x, d = 0, D = 0, max.p = 1, max.q = 0,
max.P = 0, max.Q = 0, stationary = TRUE, seasonal = FALSE, ic = 'bic',
allowmean = FALSE, allowdrift = FALSE, trace = FALSE), h = 1)$mean
})
output[5, 2] = "$140$"
output[5, 1] = "AR(1)"
output[5, 3] = round(sum((arforecast - covid_data[test_point, ])^2), digits = decimal_digits)
ar_pe[i] = output[5, 3]
# GNAR(1, [1]) global-alpha fit
covid_fit <- GNARfit(covid_data[1:train_end, ], net = NHSTrustMVCAug120.net, alphaOrder = 1, betaOrder = rep(1, 1))
output[1, 2] = 2
output[1, 3] = round(sum((predict(covid_fit) - covid_data[test_point, ])^2), digits = decimal_digits)
gnar_pe[i] = output[1, 3]
output[1, 1] = "GNAR(1, [1])"
# GNAR(1, [1]) local-alpha fit
covid_fit <- GNARfit(covid_data[1:train_end, ], net = NHSTrustMVCAug120.net, alphaOrder = 1, betaOrder = rep(1, 1), globalalpha=FALSE)
output[6, 2] = 141
output[6, 3] = round(sum((predict(covid_fit) - covid_data[test_point, ])^2), digits = decimal_digits)
gnar_no_global_pe[i] = output[6, 3]
output[6, 1] = "GNAR(1, [1])*"
# Restricted VAR(1) model
varforecast <- predict(restrict(VAR(covid_data[1:train_end, ], p = 1, type = 'none')), n.ahead = 1)
get_var_fcst <- function(x){return (x[1])}
var_point_fcst <- unlist(lapply(varforecast$fcst, get_var_fcst))
output[3, 3] = round(sum((var_point_fcst - covid_data[test_point, ])^2), digits = decimal_digits)
res_var_pe[i] = output[3, 3]
active = 0.0
aux = varforecast$model$varresult
for (ii in 1:ncol(covid_data)) {
for (j in 1:length(aux[[ii]][[1]])){
if (abs(aux[[ii]][[1]][j]) > 0.0) {
active = active + 1
}
}
}
output[3, 1] = "Res. VAR(1)"
output[3, 2] = active
# Classic VAR(1) model
varforecast <- predict(VAR(covid_data[1:train_end, ], p = 1, type = 'none'), n.ahead = 1)
get_var_fcst <- function(x){return (x[1])}
var_point_fcst <- unlist(lapply(varforecast$fcst, get_var_fcst))
output[2, 2] = "$140^2$"
output[2, 3] = round(sum((var_point_fcst - covid_data[test_point, ])^2), digits = decimal_digits)
var_pe[i] = output[2, 3]
output[2, 1] = "VAR(1)"
# Sparse VAR(1) model
sparse_var_forecast <- fitVAR(covid_data[1:train_end, ], p = 1)
xhat = sparse_var_forecast$A[[1]] %*% covid_data[train_end, ]
output[4, 3] = round(sum((xhat - covid_data[test_point, ])^2), digits = decimal_digits)
sparse_var_pe[i] = output[4, 3]
b = sparse_var_forecast$A[[1]]
b[abs(b) > 0.0] = 1
output[4, 2] = sum(rowSums(b))
output[4, 1] = "Sparse VAR(1)"
##########################
# stack all ten tables
colnames(output) <- c("Model", "Active Parameters", "One-Step SPE")
print(output)
cat("\n############################################# \n#############################################\n")
if (i == 1) {
out_data = output
} else {
out_data = rbind(out_data, output)
}
k = k + 1
}
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "5.441"
## [2,] "VAR(1)" "$140^2$" "11.613"
## [3,] "Res. VAR(1)" "3821" "9.902"
## [4,] "Sparse VAR(1)" "2759" "10.109"
## [5,] "AR(1)" "$140$" "82.055"
## [6,] "GNAR(1, [1])*" "141" "5.324"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "7.86"
## [2,] "VAR(1)" "$140^2$" "12.804"
## [3,] "Res. VAR(1)" "3712" "11.504"
## [4,] "Sparse VAR(1)" "2992" "11.533"
## [5,] "AR(1)" "$140$" "78.748"
## [6,] "GNAR(1, [1])*" "141" "7.833"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "10.286"
## [2,] "VAR(1)" "$140^2$" "12.52"
## [3,] "Res. VAR(1)" "3730" "11.605"
## [4,] "Sparse VAR(1)" "3033" "13.008"
## [5,] "AR(1)" "$140$" "82.571"
## [6,] "GNAR(1, [1])*" "141" "10.173"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "12.953"
## [2,] "VAR(1)" "$140^2$" "17.762"
## [3,] "Res. VAR(1)" "3812" "14.472"
## [4,] "Sparse VAR(1)" "3283" "15.057"
## [5,] "AR(1)" "$140$" "86.781"
## [6,] "GNAR(1, [1])*" "141" "12.676"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "7.525"
## [2,] "VAR(1)" "$140^2$" "13.82"
## [3,] "Res. VAR(1)" "3761" "10.815"
## [4,] "Sparse VAR(1)" "3315" "11.995"
## [5,] "AR(1)" "$140$" "89.097"
## [6,] "GNAR(1, [1])*" "141" "7.431"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "5.088"
## [2,] "VAR(1)" "$140^2$" "12.504"
## [3,] "Res. VAR(1)" "3830" "9.976"
## [4,] "Sparse VAR(1)" "3284" "9.863"
## [5,] "AR(1)" "$140$" "88.181"
## [6,] "GNAR(1, [1])*" "141" "4.989"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "4.44"
## [2,] "VAR(1)" "$140^2$" "7.978"
## [3,] "Res. VAR(1)" "3738" "7.716"
## [4,] "Sparse VAR(1)" "3030" "7.918"
## [5,] "AR(1)" "$140$" "91.004"
## [6,] "GNAR(1, [1])*" "141" "4.466"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "3.061"
## [2,] "VAR(1)" "$140^2$" "7.196"
## [3,] "Res. VAR(1)" "3759" "5.888"
## [4,] "Sparse VAR(1)" "3017" "6.842"
## [5,] "AR(1)" "$140$" "88.066"
## [6,] "GNAR(1, [1])*" "141" "3.089"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "8.868"
## [2,] "VAR(1)" "$140^2$" "10.869"
## [3,] "Res. VAR(1)" "3772" "10.999"
## [4,] "Sparse VAR(1)" "3298" "11.773"
## [5,] "AR(1)" "$140$" "91.151"
## [6,] "GNAR(1, [1])*" "141" "8.607"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "8.623"
## [2,] "VAR(1)" "$140^2$" "15.693"
## [3,] "Res. VAR(1)" "3794" "13.258"
## [4,] "Sparse VAR(1)" "3299" "15.715"
## [5,] "AR(1)" "$140$" "96.361"
## [6,] "GNAR(1, [1])*" "141" "8.546"
##
## #############################################
## #############################################
comp_results = cbind(as.numeric(gnar_pe), as.numeric(sparse_var_pe), as.numeric(res_var_pe), as.numeric(var_pe), as.numeric(ar_pe), as.numeric(gnar_no_global_pe))
print(colMeans(comp_results))
## [1] 7.4145 11.3813 10.6135 12.2759 87.4015 7.3134
# average number of active parameters across the ten Sparse VAR(1) fits
aux = out_data[out_data[, 1] == 'Sparse VAR(1)', 2]
print(mean(as.numeric(aux)))
## [1] 3131
# average number of active parameters across the ten Res. VAR(1) fits
aux = out_data[out_data[, 1] == 'Res. VAR(1)', 2]
print(mean(as.numeric(aux)))
## [1] 3772.9
all_summary = matrix(rep(0, 18), nrow = 6, ncol = 3)
all_summary[, 1] = colMeans(comp_results)
all_summary[, 2] = c(2, 2962, 3767, 19600, 140, 141)
for (j in 1:ncol(comp_results)) {
all_summary[j, 3] = sd(comp_results[, j])
}
# row order matches the columns of comp_results: GNAR, Sparse VAR, Res. VAR, VAR, AR, local-alpha GNAR
all_summary = data.frame(" " = c("GNAR(1, [1])", "Sp. VAR(1)", "Res. VAR(1)", "VAR(1)", "AR(1)", "GNAR(1, [1])*"), "MSPE" = all_summary[, 1], "sd(MSPE)" = all_summary[, 3], "#(parameters)" = all_summary[, 2], check.names = FALSE)
knitr::kable(all_summary, "simple")
|               | MSPE    | sd(MSPE) | #(parameters) |
|---------------|---------|----------|---------------|
| GNAR(1, [1])  | 7.4145  | 2.977923 | 2             |
| Sp. VAR(1)    | 11.3813 | 2.828714 | 2962          |
| Res. VAR(1)   | 10.6135 | 2.482946 | 3767          |
| VAR(1)        | 12.2759 | 3.184070 | 19600         |
| AR(1)         | 87.4015 | 5.146954 | 140           |
| GNAR(1, [1])* | 7.3134  | 2.900672 | 141           |
For completeness, we also compare the models at the peak of the second wave.
# number of one-step ahead predictions
# dummy counter variable
k = 0
t_steps = 10
# starting point
covid_data = logMVbedMVC.vts
initial_start = 300 - t_steps
decimal_digits = 3
# AR model prediction error
ar_pe = rep(0, t_steps)
# global-alpha GNAR model prediction error
gnar_pe = rep(0, t_steps)
# local-alpha GNAR model prediction error
gnar_no_global_pe = rep(0, t_steps)
# VAR prediction error
var_pe = rep(0, t_steps)
# restricted VAR prediction error
res_var_pe = rep(0, t_steps)
# sparse VAR prediction error
sparse_var_pe = rep(0, t_steps)
We perform one-ahead forecasting ten times and save the squared error.
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "5.48"
## [2,] "VAR(1)" "$140^2$" "9.395"
## [3,] "Res. VAR(1)" "3951" "7.249"
## [4,] "Sparse VAR(1)" "3079" "8.867"
## [5,] "AR(1)" "$140$" "810.246"
## [6,] "GNAR(1, [1])*" "141" "5.659"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "3.501"
## [2,] "VAR(1)" "$140^2$" "9.027"
## [3,] "Res. VAR(1)" "4024" "7.062"
## [4,] "Sparse VAR(1)" "2793" "7.364"
## [5,] "AR(1)" "$140$" "836.603"
## [6,] "GNAR(1, [1])*" "141" "3.787"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "9.039"
## [2,] "VAR(1)" "$140^2$" "12.792"
## [3,] "Res. VAR(1)" "3985" "11.148"
## [4,] "Sparse VAR(1)" "3009" "14.796"
## [5,] "AR(1)" "$140$" "859.096"
## [6,] "GNAR(1, [1])*" "141" "9.185"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "20.778"
## [2,] "VAR(1)" "$140^2$" "30.887"
## [3,] "Res. VAR(1)" "3951" "24.208"
## [4,] "Sparse VAR(1)" "3015" "20.72"
## [5,] "AR(1)" "$140$" "863.26"
## [6,] "GNAR(1, [1])*" "141" "21.573"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "27.22"
## [2,] "VAR(1)" "$140^2$" "85.674"
## [3,] "Res. VAR(1)" "3908" "61.49"
## [4,] "Sparse VAR(1)" "3546" "31.526"
## [5,] "AR(1)" "$140$" "864.827"
## [6,] "GNAR(1, [1])*" "141" "26.732"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "14.808"
## [2,] "VAR(1)" "$140^2$" "17.403"
## [3,] "Res. VAR(1)" "3953" "9.502"
## [4,] "Sparse VAR(1)" "3244" "12.268"
## [5,] "AR(1)" "$140$" "882.507"
## [6,] "GNAR(1, [1])*" "141" "14.717"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "2.892"
## [2,] "VAR(1)" "$140^2$" "9.648"
## [3,] "Res. VAR(1)" "3983" "6.376"
## [4,] "Sparse VAR(1)" "2975" "7.352"
## [5,] "AR(1)" "$140$" "880.315"
## [6,] "GNAR(1, [1])*" "141" "3.61"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "2.246"
## [2,] "VAR(1)" "$140^2$" "10.302"
## [3,] "Res. VAR(1)" "3947" "6.152"
## [4,] "Sparse VAR(1)" "2761" "6.074"
## [5,] "AR(1)" "$140$" "887.623"
## [6,] "GNAR(1, [1])*" "141" "2.658"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "2.534"
## [2,] "VAR(1)" "$140^2$" "4.978"
## [3,] "Res. VAR(1)" "3931" "3.166"
## [4,] "Sparse VAR(1)" "3192" "6.825"
## [5,] "AR(1)" "$140$" "876.109"
## [6,] "GNAR(1, [1])*" "141" "2.387"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "3.253"
## [2,] "VAR(1)" "$140^2$" "8.294"
## [3,] "Res. VAR(1)" "3999" "5.584"
## [4,] "Sparse VAR(1)" "3762" "9.198"
## [5,] "AR(1)" "$140$" "886.691"
## [6,] "GNAR(1, [1])*" "141" "3.622"
##
## #############################################
## #############################################
## [1] 9.1751 12.4990 14.1937 19.8400 864.7277 9.3930
## [1] 3137.6
## [1] 3963.2
| | MSPE | sd(MSPE) | #(parameters) |
|---|---|---|---|
| GNAR(1, [1]) | 9.1751 | 8.847525 | 2 |
| VAR(1) | 12.4990 | 8.064765 | 2962 |
| Res. VAR(1) | 14.1937 | 17.604062 | 3767 |
| Sp. VAR(1) | 19.8400 | 24.236849 | 19600 |
| AR(1) | 864.7277 | 24.694392 | 140 |
| GNAR(1, [1])* | 9.3930 | 8.717437 | 141 |
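The MSPE and sd(MSPE) columns above are the mean and standard deviation of the per-fold one-step SPEs printed in the blocks above (ten rolling-origin folds in total; only the later folds are shown). A minimal aggregation sketch, using only the six GNAR(1, [1]) fold values visible above (so the numbers differ slightly from the full ten-fold table):

```r
# Aggregate one-step squared prediction errors across folds.
# Illustrative values only: the six visible GNAR(1, [1]) fold SPEs;
# the table above aggregates all ten folds.
gnar_spe <- c(27.22, 14.808, 2.892, 2.246, 2.534, 3.253)
mspe <- mean(gnar_spe)   # mean squared prediction error across folds
sd_mspe <- sd(gnar_spe)  # its sample standard deviation
```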
We compare the one-, two-, and three-step ahead squared prediction errors at the end of the realisation and at the start of the second wave. The results for the end of the realisation follow in the table below.
covid_data = logMVbedMVC.vts
t_steps = 3
initial_start = 452 - t_steps
decimal_digits = 3
train_end = initial_start
test_point = train_end + 1
output <- matrix(rep(0, 18), nrow = 6, ncol = 3)
# We compare the GNAR models in this script
# 140 AR(1) models
arforecast <- apply(covid_data[1:train_end, ], 2, function(x){
forecast(auto.arima(x, d = 0, D = 0, max.p = 1, max.q = 0,
max.P = 0, max.Q = 0, stationary = TRUE, seasonal = FALSE, ic = 'bic',
allowmean = FALSE, allowdrift = FALSE, trace = FALSE), h = 3)$mean
})
# arforecast rows 1-3 hold the one-, two- and three-step ahead forecasts
one_step_ahead_ar = round(sum((arforecast[1, ] - covid_data[test_point, ])^2), decimal_digits)
two_step_ahead_ar = round(sum((arforecast[2, ] - covid_data[test_point + 1, ])^2), decimal_digits)
three_step_ahead_ar = round(sum((arforecast[3, ] - covid_data[test_point + 2, ])^2), decimal_digits)
ar_spde_three_steps = c(one_step_ahead_ar, two_step_ahead_ar, three_step_ahead_ar)
# GNAR(1, [1]) global-alpha fit
covid_fit <- GNARfit(covid_data[1:train_end, ], net = NHSTrustMVCAug120.net, alphaOrder = 1, betaOrder = rep(1, 1))
three_step_ahead_predictions = predict(covid_fit, 3)
one_step_ahead_gnar_global = round(sum((three_step_ahead_predictions[1, ] - covid_data[test_point, ])^2), digits = decimal_digits)
two_step_ahead_gnar_global = round(sum((three_step_ahead_predictions[2, ] - covid_data[test_point + 1, ])^2), digits = decimal_digits)
three_step_ahead_gnar_global = round(sum((three_step_ahead_predictions[3, ] - covid_data[test_point + 2, ])^2), digits = decimal_digits)
gnar_global_spde_three_steps = c(one_step_ahead_gnar_global, two_step_ahead_gnar_global, three_step_ahead_gnar_global)
# GNAR(1, [1]) local-alpha fit
covid_fit_local <- GNARfit(covid_data[1:train_end, ], net = NHSTrustMVCAug120.net, alphaOrder = 1, betaOrder = rep(1, 1), globalalpha=FALSE)
three_step_ahead_predictions = predict(covid_fit_local, 3)
one_step_ahead_gnar_local = round(sum((three_step_ahead_predictions[1, ] - covid_data[test_point, ])^2), digits = decimal_digits)
two_step_ahead_gnar_local = round(sum((three_step_ahead_predictions[2, ] - covid_data[test_point + 1, ])^2), digits = decimal_digits)
three_step_ahead_gnar_local = round(sum((three_step_ahead_predictions[3, ] - covid_data[test_point + 2, ])^2), digits = decimal_digits)
gnar_local_spde_three_steps = c(one_step_ahead_gnar_local, two_step_ahead_gnar_local, three_step_ahead_gnar_local)
# Restricted VAR(1) model
res_varforecast <- predict(restrict(VAR(covid_data[1:train_end, ], p = 1, type = 'none')), n.ahead = 3)
# Vectorise model output and predictions
res_var_point_one_step_ahead <- vapply(res_varforecast$fcst, function(x) x[1], numeric(1))
res_var_point_two_step_ahead <- vapply(res_varforecast$fcst, function(x) x[2], numeric(1))
res_var_point_three_step_ahead <- vapply(res_varforecast$fcst, function(x) x[3], numeric(1))
#Compute mean squared prediction error
res_var_mspe_one_step = round(sum((res_var_point_one_step_ahead - covid_data[test_point, ])^2), digits = decimal_digits)
res_var_mspe_two_step = round(sum((res_var_point_two_step_ahead - covid_data[test_point + 1, ])^2), digits = decimal_digits)
res_var_mspe_three_step = round(sum((res_var_point_three_step_ahead - covid_data[test_point + 2, ])^2), digits = decimal_digits)
res_var_spde_three_steps = c(res_var_mspe_one_step, res_var_mspe_two_step, res_var_mspe_three_step)
# Classic VAR(1) model
varforecast <- predict(VAR(covid_data[1:train_end, ], p = 1, type = 'none'), n.ahead = 3)
# Vectorise model output and predictions
var_point_one_step_ahead <- vapply(varforecast$fcst, function(x) x[1], numeric(1))
var_point_two_step_ahead <- vapply(varforecast$fcst, function(x) x[2], numeric(1))
var_point_three_step_ahead <- vapply(varforecast$fcst, function(x) x[3], numeric(1))
#Compute mean squared prediction error
var_mspe_one_step = round(sum((var_point_one_step_ahead - covid_data[test_point, ])^2), digits = decimal_digits)
var_mspe_two_step = round(sum((var_point_two_step_ahead - covid_data[test_point + 1, ])^2), digits = decimal_digits)
var_mspe_three_step = round(sum((var_point_three_step_ahead - covid_data[test_point + 2, ])^2), digits = decimal_digits)
var_spde_three_steps = c(var_mspe_one_step, var_mspe_two_step, var_mspe_three_step)
# Sparse VAR(1) model
sparse_var_forecast <- fitVAR(covid_data[1:train_end, ], p = 1)
one_step_ahead_svar = sparse_var_forecast$A[[1]] %*% covid_data[train_end, ]
two_step_ahead_svar = sparse_var_forecast$A[[1]] %*% one_step_ahead_svar
three_step_ahead_svar = sparse_var_forecast$A[[1]] %*% two_step_ahead_svar
svar_mspe_one_step = round(sum((one_step_ahead_svar - covid_data[test_point, ])^2), digits = decimal_digits)
svar_mspe_two_step = round(sum((two_step_ahead_svar - covid_data[test_point + 1, ])^2), digits = decimal_digits)
svar_mspe_three_step = round(sum((three_step_ahead_svar - covid_data[test_point + 2, ])^2), digits = decimal_digits)
svar_spde_three_steps = c(svar_mspe_one_step, svar_mspe_two_step, svar_mspe_three_step)
##########################
spde_table = data.frame("St. Ahead" = c("one", "two", "three"), "AR(1)" = ar_spde_three_steps, "GNAR(1, [1])" = gnar_global_spde_three_steps, "GNAR(1, [1])*" = gnar_local_spde_three_steps,
"Res. VAR(1)" = res_var_spde_three_steps, "VAR(1)" = var_spde_three_steps, "Sp. VAR(1)" = svar_spde_three_steps,
check.names = FALSE)
knitr::kable(spde_table, 'simple')
| St. Ahead | AR(1) | GNAR(1, [1]) | GNAR(1, [1])* | Res. VAR(1) | VAR(1) | Sp. VAR(1) |
|---|---|---|---|---|---|---|
| one | 88.066 | 3.061 | 3.089 | 5.888 | 7.196 | 6.842 |
| two | 91.657 | 11.620 | 11.106 | 14.726 | 15.310 | 18.482 |
| three | 102.989 | 19.273 | 18.180 | 23.780 | 25.456 | 31.739 |
The comparison at the start of the second wave, i.e., \(T = 300\), is given below.
| St. Ahead | AR(1) | GNAR(1, [1]) | GNAR(1, [1])* | Res. VAR(1) | VAR(1) | Sp. VAR(1) |
|---|---|---|---|---|---|---|
| one | 887.623 | 2.246 | 2.658 | 6.152 | 10.302 | 6.599 |
| two | 886.906 | 5.054 | 4.145 | 9.964 | 13.527 | 10.772 |
| three | 881.981 | 7.984 | 6.264 | 13.769 | 14.843 | 16.757 |
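Each per-horizon entry in the tables above is a sum of squared errors across the 140 nodes; the script computes these step by step, which can be condensed into a small helper. A minimal sketch (the function name is ours, not from the paper):

```r
# Per-horizon squared prediction error across nodes, given an
# h x d matrix of point forecasts and the matching h x d matrix
# of held-out observations.
spe_by_horizon <- function(preds, actuals) {
  rowSums((preds - actuals)^2)
}
```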
For completeness, we also compare GNAR with a conditional autoregressive model that includes a temporal autoregression (CARar). Mean squared prediction error (MSPE) comparisons, including its standard deviation (sd. MSPE), follow in the table below. Interestingly, both models perform similarly; indeed, CARar models can be seen as GNAR models constrained to one-stage neighbourhood regressions. Fitting CARar requires expensive MCMC runs, so for brevity we only summarise the results below; see https://github.com/dansal182/new_tools_for_network_time_series_with_application_to_COVID_19_hospitalisations for the experiment code.
| | GNAR(1, [1]) | sp. VAR | res. VAR | VAR | CARar | GNARc | Naive |
|---|---|---|---|---|---|---|---|
| MSPE | 7.40 | 11.4 | 10.60 | 12.30 | 7.5 | 7.20 | 7.60 |
| sd. MSPE | 2.98 | 2.8 | 2.48 | 3.18 | 3.1 | 2.82 | 3.12 |
| #(parameters) | 2.00 | 3089.0 | 3773.00 | 19600.00 | 5.0 | 2.00 | 0.00 |
Above, GNARc denotes the GNAR model with centred columns, i.e., we subtract each column mean before fitting the model. We also note the difference in execution time (in seconds):
| | Total time | Average time |
|---|---|---|
| GNAR(1, [1]) | 6.2 | 0.62 |
| CARar(1) | 7821.9 | 782.19 |
Execution time comparison between global-\(\alpha\) GNAR models and CARar models. The total execution time is the time, in seconds, taken to compute the one-step ahead forecasts ten times (i.e., the forecasts used for producing the MSPE table); the average time is the total time divided by ten.
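The column-centring used by the GNARc variant mentioned above (subtracting each column mean before fitting) can be sketched as follows; the function name is ours, for illustration only:

```r
# Subtract each node's (column) sample mean from a T x d series,
# as in the GNARc variant described above.
centre_columns <- function(X) sweep(X, 2, colMeans(X), "-")
```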
We remark that \(n\)-step ahead forecasting is computationally cumbersome with CAR models. Unfortunately, the \(\texttt{CARBayesST}\) package does not provide a forecasting method. However, these forecasts can be computed by treating the estimated model as a VAR process, which resembles a GNAR\((1, [1])\) model; see the supplementary material for details.
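As noted, \(h\)-step ahead forecasts can be obtained by iterating the estimated one-step map as a VAR(1). A minimal sketch, where `Phi` is a hypothetical \(d \times d\) transition matrix assumed to have been extracted from the fitted model (not an output of \(\texttt{CARBayesST}\) itself):

```r
# Iterate the one-step map x_{t+1} = Phi x_t to produce h-step
# ahead point forecasts from the last observation x_last.
iterate_var1 <- function(Phi, x_last, h) {
  preds <- matrix(0, nrow = h, ncol = length(x_last))
  x <- x_last
  for (i in seq_len(h)) {
    x <- as.vector(Phi %*% x)
    preds[i, ] <- x
  }
  preds
}
```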
We analyse the impact individual nodes have on network autocorrelation for our choice of model order. Further, we study the relative importance each node has in pair-wise node regressions, and compare the one-lag cross-correlation matrix with a heat map obtained from Theorem 2; this comparison shows the self-similar clustering behaviour of the pair-wise correlations.
We start by defining the different waves of the pandemic and looking at an R-Corbit plot of the realised network autocorrelation.
covid_data = logMVbedMVC.vts
W_nhs_network = weights_matrix(NHSTrustMVCAug120.net, 6)
first_wave = covid_data[1:100, ]
gap = covid_data[101:177, ]
second_wave = covid_data[178:447, ]
time_slices = c('First Wave: 04-2020 -> 07-2020', 'Gap: 07-2020 -> 09-2020', 'Second Wave: 09-2020 -> 07-2021')
R_corbit_plot(list(first_wave, gap, second_wave), list(NHSTrustMVCAug120.net), 10, 6, weight_matrices = list(W_nhs_network), same_net = 'yes',
frame_names = time_slices)
The PNACF R-Corbit plot is produced with the code below.
R_corbit_plot(list(first_wave, gap, second_wave), list(NHSTrustMVCAug120.net), 10, 6, weight_matrices = list(W_nhs_network), same_net = 'yes',
frame_names = time_slices, partial = 'yes')
We explore the importance individual nodes have for predicting each other with the heat map below. Note that it is not symmetric; the yellow colour highlights the strongest pair-wise relations according to formula (18) in \(\textbf{New tools for network time series with an application to COVID-19 hospitalisations}\).
active_node_plot(second_wave, NHSTrustMVCAug120.net, 1, c(1))
We continue by looking at the impact each node has on the total network autocorrelation (i.e., the column in \(\mathbf{W}\) that dominates autocorrelation). We do this by producing the node relevance hierarchy and plot, which are computed using (17).
node_relevance_plot(NHSTrustMVCAug120.net, 1, colnames(covid_data), 1)
## [[1]]
##
## [[2]]
## Node Relevance
## 1 RGP 0.2388075
## 2 REF 0.2590320
## 3 RNN 0.3423592
## 4 RM1 0.3481702
## 5 RJL 0.3482829
## 6 RWA 0.4361727
## 7 RVV 0.4489792
## 8 RXC 0.4656740
## 9 RLQ 0.4718158
## 10 RCX 0.4981928
## 11 R1F 0.5329288
## 12 RTF 0.5451035
## 13 RTD 0.5607670
## 14 RWD 0.5672991
## 15 RXH 0.5723261
## 16 R0B 0.5776501
## 17 RR7 0.5914945
## 18 RBD 0.6028521
## 19 RWP 0.6045146
## 20 R0D 0.6237309
## 21 RDY 0.6300686
## 22 RBL 0.6343268
## 23 RVY 0.6360888
## 24 RYR 0.6388915
## 25 RBZ 0.6407169
## 26 REN 0.6555618
## 27 RXW 0.6644188
## 28 RTQ 0.6657547
## 29 RXL 0.6672037
## 30 RD1 0.6738285
## 31 RAJ 0.6772388
## 32 RA7 0.6775857
## 33 REM 0.6792473
## 34 RNZ 0.6839019
## 35 RQ3 0.6840216
## 36 RRK 0.6843320
## 37 RET 0.6886182
## 38 RRJ 0.6945536
## 39 RVJ 0.6959880
## 40 RKB 0.6985588
## 41 RHU 0.7060903
## 42 RJR 0.7124531
## 43 RWF 0.7136386
## 44 RH8 0.7216425
## 45 RLT 0.7233334
## 46 RBS 0.7266070
## 47 RJC 0.7274836
## 48 RBQ 0.7277001
## 49 RWE 0.7277560
## 50 RXN 0.7293355
## 51 RDE 0.7317239
## 52 RGN 0.7323235
## 53 RNA 0.7341533
## 54 RPA 0.7402341
## 55 RTR 0.7405527
## 56 RK9 0.7428719
## 57 RXK 0.7429187
## 58 RVW 0.7525139
## 59 RGR 0.7581291
## 60 RBN 0.7648500
## 61 RP5 0.7679661
## 62 RL4 0.7741657
## 63 RHM 0.7741876
## 64 RTX 0.7802541
## 65 RXE 0.7804340
## 66 RFR 0.7804340
## 67 RRF 0.7837088
## 68 RA9 0.7865553
## 69 RH5 0.7924808
## 70 RN7 0.7947044
## 71 RNQ 0.7962871
## 72 RP1 0.7974614
## 73 RBK 0.7993277
## 74 RXP 0.8151463
## 75 RF4 0.8178328
## 76 RTP 0.8213661
## 77 RA4 0.8273467
## 78 RXR 0.8290119
## 79 RJ2 0.8296417
## 80 RJ6 0.8388167
## 81 RCB 0.8419626
## 82 RMC 0.8420556
## 83 RWW 0.8421183
## 84 R1H 0.8458930
## 85 RJ1 0.8477358
## 86 RQX 0.8503438
## 87 RAP 0.8514368
## 88 RP4 0.8529093
## 89 RJZ 0.8534630
## 90 RRV 0.8548508
## 91 RTK 0.8587055
## 92 RJ7 0.8592502
## 93 RKE 0.8602540
## 94 RAL 0.8605143
## 95 RK5 0.8607855
## 96 RNS 0.8640082
## 97 RX1 0.8650822
## 98 RVR 0.8651983
## 99 RPY 0.8655824
## 100 RQM 0.8663747
## 101 RYJ 0.8680895
## 102 RWH 0.8682384
## 103 RT1 0.8695644
## 104 RR8 0.8723391
## 105 RGT 0.8759949
## 106 RGM 0.8763338
## 107 RAX 0.8783572
## 108 RC9 0.8839073
## 109 RXF 0.8857286
## 110 RA2 0.8887197
## 111 RAN 0.8925245
## 112 RM3 0.8945031
## 113 RQW 0.8950113
## 114 R1K 0.8981421
## 115 RW6 0.8999965
## 116 RAE 0.9023903
## 117 RAS 0.9059743
## 118 RDU 0.9060968
## 119 RTG 0.9105164
## 120 RWG 0.9174765
## 121 RN5 0.9209021
## 122 RHQ 0.9258074
## 123 RBT 0.9285356
## 124 RD8 0.9324912
## 125 RWY 0.9352884
## 126 RCF 0.9354989
## 127 RFF 0.9416439
## 128 RCU 0.9437173
## 129 R0A 0.9490081
## 130 RFS 0.9523243
## 131 RTH 0.9524462
## 132 RNU 0.9609052
## 133 RJE 0.9688577
## 134 RN3 0.9700914
## 135 RMP 0.9724912
## 136 RCD 0.9729392
## 137 RHW 0.9736950
## 138 RWJ 0.9790502
## 139 RJN 0.9880315
## 140 RXQ 1.0000000
We compare the observed one-lag cross-correlation with the structure Theorem 2 suggests in the plots below. Note that the neighbourhood structure and self-similarity suggest that outbreaks depend mostly on direct neighbours, i.e., the virus cannot reach nodes by jumps: it must traverse all NHS trusts (i.e., nodes) on a path between two places to instigate an outbreak.
# larger r-stage changes the correlation structure
max_r_stage = 1
local_relevance_plot(NHSTrustMVCAug120.net, max_r_stage)
cross_correlation_plot(1, covid_data)
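For reference, the empirical one-lag cross-correlation matrix underlying the plot can be computed directly. A minimal sketch for a \(T \times d\) series (the function name is ours; the convention may differ from \(\texttt{cross\_correlation\_plot}\) by a transpose):

```r
# Empirical lag-one cross-correlation matrix of a T x d series X:
# entry (i, j) correlates node i at time t with node j at time t + 1.
lag1_cross_cor <- function(X) {
  T_obs <- nrow(X)
  cor(X[seq_len(T_obs - 1), ], X[2:T_obs, ])
}
```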
The three heat-map plots above (i.e., all except the node relevance plot) share the following column and row names.
dummy_vec = vapply(c(1:140), function(x) {cat(paste0(x, ": ", colnames(covid_data)[x]), "\n"); return (0)}, 0)
## 1: RCF
## 2: RBS
## 3: RTK
## 4: RF4
## 5: RFF
## 6: R1H
## 7: RC9
## 8: RQ3
## 9: RXL
## 10: RMC
## 11: RAE
## 12: RXH
## 13: RXQ
## 14: RWY
## 15: RGT
## 16: RT1
## 17: RQM
## 18: RFS
## 19: RJR
## 20: RXP
## 21: RJ6
## 22: RN7
## 23: RP5
## 24: RBD
## 25: RDY
## 26: RJN
## 27: RVV
## 28: RXR
## 29: RDE
## 30: RXC
## 31: RWH
## 32: RVR
## 33: RDU
## 34: RR7
## 35: RLT
## 36: RTQ
## 37: RP4
## 38: RN3
## 39: RJ1
## 40: RN5
## 41: RCD
## 42: RQX
## 43: RWA
## 44: RYJ
## 45: R1F
## 46: RGP
## 47: RNQ
## 48: RJZ
## 49: RAX
## 50: RXN
## 51: RR8
## 52: RJ2
## 53: RBQ
## 54: REM
## 55: R1K
## 56: RWF
## 57: R0A
## 58: RPA
## 59: RBT
## 60: RXF
## 61: RAJ
## 62: RD8
## 63: RM1
## 64: RVJ
## 65: RNN
## 66: RAP
## 67: RVW
## 68: RGN
## 69: RNS
## 70: RP1
## 71: RBZ
## 72: RJL
## 73: RTF
## 74: RX1
## 75: RNU
## 76: RTH
## 77: RW6
## 78: RHU
## 79: RXE
## 80: RHW
## 81: REF
## 82: RH8
## 83: RAL
## 84: RAN
## 85: RGM
## 86: RA2
## 87: RD1
## 88: RM3
## 89: RNZ
## 90: RXK
## 91: RCU
## 92: RHQ
## 93: RK5
## 94: RH5
## 95: RTR
## 96: R0B
## 97: RJC
## 98: RVY
## 99: RJ7
## 100: RBN
## 101: RWJ
## 102: RTP
## 103: RMP
## 104: REN
## 105: RNA
## 106: RAS
## 107: RTD
## 108: RQW
## 109: RCX
## 110: RFR
## 111: RPY
## 112: RRJ
## 113: RL4
## 114: RXW
## 115: RET
## 116: RA9
## 117: RWD
## 118: RRV
## 119: RHM
## 120: RRK
## 121: RA7
## 122: RKB
## 123: R0D
## 124: RK9
## 125: RTG
## 126: RWE
## 127: RTX
## 128: RJE
## 129: RBK
## 130: RWW
## 131: RWG
## 132: RGR
## 133: RYR
## 134: RKE
## 135: RBL
## 136: RWP
## 137: RRF
## 138: RLQ
## 139: RA4
## 140: RCB
We plot the network and colour the nodes according to node relevance given by (18).
For comparison purposes, and to highlight that node relevance depends on the network time series dynamics, we also plot the node relevance for \(r^* = 3\) (i.e., largest shortest path equal to three) and \(r^* = 6\) (i.e., largest shortest path equal to six).
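Assuming the second argument of \(\texttt{node\_relevance\_plot}\) sets the active \(r\)-stage (as in the call earlier in this section), these comparisons would be produced along the lines of:

```r
# Node relevance with larger active r-stages; the second argument
# is assumed (from the earlier call) to set the maximum r-stage.
node_relevance_plot(NHSTrustMVCAug120.net, 3, colnames(covid_data), 1)
node_relevance_plot(NHSTrustMVCAug120.net, 6, colnames(covid_data), 1)
```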